(multiple) Add roles for backup/restore functionality #3886
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/53bd5aeecb3c49f2a2b055441269b372 ✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 19m 39s |
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/bdf74b0181114163974bebdac49d4001 ✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 14m 41s |
|
recheck Zuul seems to be stuck |
|
Build failed (check pipeline). Post ✔️ openstack-k8s-operators-content-provider SUCCESS in 12m 27s |
|
Build failed (check pipeline). Post ✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 04m 50s |
|
recheck |
|
Build failed (check pipeline). Post ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 37m 00s |
|
recheck |
|
Build failed (check pipeline). Post ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 35m 46s |
|
recheck |
|
Build failed (check pipeline). Post ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 38m 34s |
|
recheck |
|
The role prefix job is reported as FAILED when it should be green. Currently the job looks only at the last commit but checks all the files the PR contains. Fixing in #3903 |
|
recheck |
|
Once rebased, the CI failure should be fixed. |
|
@evallesp CI passed. Are we good to get this landed? We'd need it for the next FR |
|
rebased |
| dest: "{{ _deploy_minio_rendered_dir.path }}/minio.yaml" | ||
| mode: "0644" | ||
|
|
||
| - name: Apply MinIO manifests |
(blocking) suggestion: I'd rather use kubernetes.core.k8s instead of shell.
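A minimal sketch of what the module-based task could look like. It is not taken from the PR. It assumes the manifest rendered by the template task above exists on the host where the task runs, and that cluster authentication is already configured for the kubernetes.core collection (for example through the environment).

```yaml
- name: Apply MinIO manifests
  kubernetes.core.k8s:
    state: present
    # Assumption: this is the file written by the preceding template task.
    # `src` is read from the filesystem of the host running the module.
    src: "{{ _deploy_minio_rendered_dir.path }}/minio.yaml"
```

The module reports `changed` only when an object is actually created or modified, so no manual `changed_when` should be needed here.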
| @@ -0,0 +1,74 @@ | |||
| --- | |||
(non-blocking) suggestion: Marking as non-blocking since it can be done in a follow-up PR, but we need a README here.
| changed_when: true | ||
|
|
||
| - name: Wait for Velero pod to be ready | ||
| ansible.builtin.shell: | |
(blocking) suggestion: let's use: kubernetes.core.k8s.
(non-blocking) suggestion: I'd first check that the pod count is > 0.
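One way both suggestions could be combined, as a rough sketch only. The namespace, the label selector, the retry count, and the `_velero_pods` variable name are assumptions for illustration and are not taken from the role. The `Running` phase is used as a simple stand-in for readiness.

```yaml
- name: Wait for Velero pod to be ready
  kubernetes.core.k8s_info:
    kind: Pod
    namespace: openshift-adp              # assumption: OADP namespace
    label_selectors:
      - app.kubernetes.io/name=velero     # assumption: Velero pod label
  register: _velero_pods                  # hypothetical variable name
  retries: 30                             # assumption: retry budget
  delay: 10
  until:
    # Guard first: an empty list must not count as "all pods are running".
    - _velero_pods.resources | length > 0
    # Then require that no returned pod is in a phase other than Running.
    - >-
      _velero_pods.resources | map(attribute='status.phase')
      | reject('equalto', 'Running') | list | length == 0
```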
| delay: 10 | ||
| until: _operator_wait.rc == 0 | ||
|
|
||
| - name: Create cloud credentials secret |
(blocking) concern: I'm unsure here. At a minimum, it's important to add no_log: true.
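For illustration, a sketch of a secret-creating task with `no_log: true`. The secret name, namespace, key layout, and the `_example_*` variables are hypothetical and do not come from the PR.

```yaml
- name: Create cloud credentials secret
  no_log: true   # keep the credentials out of task output and logs
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: v1
      kind: Secret
      metadata:
        name: cloud-credentials       # assumption
        namespace: openshift-adp      # assumption
      type: Opaque
      stringData:
        # Hypothetical variables holding the S3 credentials.
        cloud: |
          [default]
          aws_access_key_id={{ _example_access_key }}
          aws_secret_access_key={{ _example_secret_key }}
```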
| when: cifmw_openshift_adp_enable_node_agent | bool | ||
|
|
||
| - name: Get OADP pods | ||
| ansible.builtin.shell: | |
(blocking) suggestion: let's use: kubernetes.core.k8s.
| # VolumeSnapshotClass for CSI snapshots | ||
| # ======================================== | ||
| - name: Check for existing VolumeSnapshotClass | ||
| ansible.builtin.shell: | |
(blocking) suggestion: let's use: kubernetes.core.k8s.
| changed_when: true | ||
|
|
||
| - name: Create Subscription for OADP operator | ||
| ansible.builtin.shell: | |
(blocking) suggestion: let's use: kubernetes.core.k8s.
| changed_when: true | ||
|
|
||
| - name: Create OperatorGroup for OADP | ||
| ansible.builtin.shell: | |
(blocking) suggestion: let's use: kubernetes.core.k8s.
| - "Node Agent (Kopia): {{ cifmw_openshift_adp_enable_node_agent }}" | ||
|
|
||
| - name: Create OADP namespace | ||
| ansible.builtin.shell: | |
(blocking) suggestion: let's use: kubernetes.core.k8s.
| delay: 10 | ||
| until: _velero_wait.rc == 0 | ||
|
|
||
| - name: Wait for node-agent pods to be ready |
(blocking) suggestion: Let's remove failed_when: false
(non-blocking) suggestion: I'd first check that the pod count is > 0.
| register: _s3_api_url | ||
| changed_when: false | ||
|
|
||
| - name: Create DataProtectionApplication |
(blocking) suggestion: let's use: kubernetes.core.k8s.
Deploy MinIO as a lightweight S3-compatible object store for use as the Velero backup target in development and CI environments.
Signed-off-by: Andrew Bays <abays@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Install and configure the OADP (OpenShift API for Data Protection) operator with an S3-compatible storage backend, create the DataProtectionApplication CR, set up VolumeSnapshotClass for CSI snapshots, and verify the BackupStorageLocation is available.
Signed-off-by: Andrew Bays <abays@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
evallesp left a comment
In general this looks good, but I'd try to move away from .shell to the kubernetes.core.k8s* modules where possible.
Also, I see some changed_when: true that could instead check an actual condition.
| for backup/restore. Without it, user-provided resources (e.g. osp-secret) | ||
| will not be restored. | ||
| Create an OpenStackBackupConfig CR before running backup. | ||
| when: _backupconfig_check.rc != 0 or _backupconfig_check.stdout == "" |
(non-blocking) suggestion: I think this is OK. The model review said we might also want to check .status.conditions.
| @@ -0,0 +1,14 @@ | |||
| --- | |||
(non-blocking) question: is this name intended to be 06a?
| - name: Create OADP PVC backup | ||
| ansible.builtin.shell: | | ||
| oc apply -f {{ _cifmw_backup_restore_rendered_dir.path }}/backup-pvcs.yaml | ||
| changed_when: true |
(non-blocking) suggestion: I think we should add a better changed_when clause here by checking the oc apply output.
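A possible shape for this, as a sketch: `oc apply` prints `created`, `configured`, or `unchanged` for each object, so the registered stdout can drive `changed_when`. The `_backup_pvcs_apply` variable name is hypothetical.

```yaml
- name: Create OADP PVC backup
  ansible.builtin.shell: |
    oc apply -f {{ _cifmw_backup_restore_rendered_dir.path }}/backup-pvcs.yaml
  register: _backup_pvcs_apply   # hypothetical variable name
  # Report a change only when oc created or modified at least one object.
  changed_when: >-
    _backup_pvcs_apply.stdout is search('created|configured')
```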
| - name: Create OADP resources backup | ||
| ansible.builtin.shell: | | ||
| oc apply -f {{ _cifmw_backup_restore_rendered_dir.path }}/backup-resources.yaml | ||
| changed_when: true |
(non-blocking) suggestion: I think we should add a better changed_when clause here by checking the oc apply output.
| - name: Delete DataPlaneDeployment CRs | ||
| ansible.builtin.shell: | | ||
| oc delete openstackdataplanedeployment --all -n {{ cifmw_backup_restore_namespace }} | ||
| changed_when: true |
(non-blocking) suggestion: I think we should add a better changed_when clause here by checking the oc delete output. (This applies to all the delete tasks here.)
| when: cifmw_backup_restore_cleanup_dataplane | bool | ||
|
|
||
| - name: Delete DataPlaneNodeSet CRs | ||
| ansible.builtin.shell: | |
(blocking) suggestion: Do you think it's possible to move away from shell in the deletes?
Changed many of them based on the agent's recommendation. Some still use shell. If you'd like the rest converted, I will look into it.
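For the deletes that still use shell, one module-based option is to list the CRs first and then remove each by name. This is only a sketch: the API version and the `_nodesets` variable name are assumptions, and the PR may already handle this differently.

```yaml
- name: List DataPlaneNodeSet CRs
  kubernetes.core.k8s_info:
    api_version: dataplane.openstack.org/v1beta1   # assumption
    kind: OpenStackDataPlaneNodeSet
    namespace: "{{ cifmw_backup_restore_namespace }}"
  register: _nodesets   # hypothetical variable name
  when: cifmw_backup_restore_cleanup_dataplane | bool

- name: Delete DataPlaneNodeSet CRs
  kubernetes.core.k8s:
    state: absent
    api_version: dataplane.openstack.org/v1beta1   # assumption
    kind: OpenStackDataPlaneNodeSet
    namespace: "{{ cifmw_backup_restore_namespace }}"
    name: "{{ item.metadata.name }}"
    wait: true   # block until each object is actually gone
  loop: "{{ _nodesets.resources | default([]) }}"
  loop_control:
    label: "{{ item.metadata.name }}"
  when: cifmw_backup_restore_cleanup_dataplane | bool
```

With this shape the task reports `changed` only for objects that existed, which also covers the earlier note about `changed_when: true` on the delete tasks.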
Orchestrate backup, restore, and cleanup of OpenStack control plane and data plane resources, including Galera database dumps, Velero CSI volume snapshots, and ordered multi-phase restore sequences. Also adds playbooks (backup_restore.yaml) and integrates backup and restore into the post-deployment pipeline.
Signed-off-by: Andrew Bays <abays@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Add three new Ansible roles for OpenStack on OpenShift backup and
restore using OADP (OpenShift API for Data Protection) and Velero:
cifmw_backup_restore: orchestrates backup, restore, and cleanup of OpenStack control plane and data plane resources, including Galera database dumps, Velero CSI volume snapshots, and ordered multi-phase restore sequences.
openshift_adp: installs and configures the OADP operator with an S3-compatible storage backend, creates the DataProtectionApplication CR, sets up VolumeSnapshotClass for CSI snapshots, and verifies the BackupStorageLocation is available.
deploy_minio: deploys MinIO as a lightweight S3-compatible object store for use as the Velero backup target in development and CI environments.
Also adds playbooks (backup_restore.yaml, backup_restore_tasks.yaml)
to integrate backup and restore into the post-deployment pipeline.
Jira: https://redhat.atlassian.net/browse/OSPRH-22913
Jira: https://redhat.atlassian.net/browse/OSPRH-29819
Jira: https://redhat.atlassian.net/browse/OSPRH-30021
Signed-off-by: Andrew Bays <abays@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>